Minimum Classification Error Based Spectro-Temporal Feature Extraction for Robust Audio Classification
Abstract
Mel-frequency cepstral coefficients (MFCCs) are the most popular features for automatic audio classification (AAC). However, MFCCs are often not robust in adverse environments. In this paper, a minimum classification error (MCE)-based method is proposed to extract new, robust spectro-temporal features as alternatives to MFCCs. The robustness of the proposed features is evaluated on noisy non-speech sounds from the RWCP Sound Scene Database in Real Acoustic Environment, using settings modeled on the Aurora 2 multi-condition training task. Experimental results show that the proposed features achieve the lowest average recognition error rate, 3.17%, compared with state-of-the-art MFCCs plus mean subtraction, variance normalization, and ARMA filtering (MFCC+MVA, 4.31%), Gabor filters with principal component analysis (Gabor+PCA, 4.43%), and linear discriminant analysis (LDA, 4.20%) features. We thus confirm the robustness of the proposed spectro-temporal feature extraction approach.
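The MFCC+MVA baseline named in the abstract combines mean subtraction, variance normalization, and ARMA filtering. The abstract gives no implementation details, so the following is only a minimal sketch of one common formulation of that post-processing. The function name `mva_normalize`, the filter order of 2, the per-utterance statistics, and the choice to leave edge frames unsmoothed are all assumptions for illustration, not the authors' configuration.

```python
import numpy as np

def mva_normalize(features, order=2, eps=1e-8):
    """Apply mean subtraction, variance normalization, and ARMA smoothing.

    features: array of shape (T, D), one row per frame (e.g. MFCC vectors).
    order:    ARMA filter order (hypothetical default, not from the source).
    """
    x = np.asarray(features, dtype=float)

    # Mean subtraction and variance normalization, per coefficient,
    # using statistics computed over the whole utterance.
    x = (x - x.mean(axis=0)) / (x.std(axis=0) + eps)

    # ARMA smoothing: average the previous `order` smoothed frames with the
    # current and next `order` normalized frames. Edge frames are left as-is.
    out = x.copy()
    num_frames = x.shape[0]
    for t in range(order, num_frames - order):
        past = out[t - order:t].sum(axis=0)
        future = x[t:t + order + 1].sum(axis=0)
        out[t] = (past + future) / (2 * order + 1)
    return out
```

Calling `mva_normalize(mfcc_matrix)` on a frames-by-coefficients matrix returns an array of the same shape. Setting `order=0` disables the smoothing and leaves only the mean and variance normalization.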
Similar resources
Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain
This article presents a new feature extraction technique based on the temporal tracking of clusters in the spectro-temporal feature space. In the proposed method, auditory cortical outputs were clustered, and the attributes of the speech clusters were extracted as secondary features. However, the shape and position of speech clusters change over time. The clusters were temporally tracked and temporal tra...
Using Spectro-Temporal Features for Environmental Sounds Recognition
The paper presents the task of recognizing environmental sounds for audio surveillance and security applications. Various characteristics have been proposed for audio classification, including the popular Mel-frequency cepstral coefficients (MFCCs), which describe the audio spectral shape. However, some temporal-domain features also exist. The latter have been developed to character...
A New Feature Extraction Method Based on Compressive Sampling for Audio Signal Processing
In this paper, we present a Compressive Sampling (CS)-based feature extraction method for audio signals. In the proposed approach, the audio signal is first segmented using Hamming windows, and the Discrete Fourier Transform (DFT) of the samples is calculated within each frame. Then, the normalized values of the DFT coefficients of each frame are accumulated. At the next step, the second DFT is a...
Nonnegative features of spectro-temporal sounds for classification
A parts-based representation is a way of understanding object recognition in the brain. Nonnegative matrix factorization (NMF) is an algorithm that can learn a parts-based representation by allowing only non-subtractive combinations (Lee and Seung, 1999). In this paper we incorporate a parts-based representation of spectro-temporal sounds into the acoustic feature extraction, which ...
Neural Network Based Recognition System Integrating Feature Extraction and Classification for English Handwritten
Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. It has numerous applications, including reading aids for the blind, bank cheques, and conversion of any handwritten document into structured text form. Neural Network (NN) with its inherent learning ability offers promising solutions for handwritten characte...